8 research outputs found

    Modeling of Multivariate Longitudinal Phenotypes in Family Genetic Studies with Bayesian Multiplicity Adjustment

    Genetic studies often collect data on multiple traits. Most genetic association analyses, however, consider traits separately and ignore potential correlation among traits, partly because of difficulties in statistical modeling of multivariate outcomes. When multiple traits are measured longitudinally in a pedigree, additional challenges arise: besides the correlation between traits, a trait is often correlated with its own measures over time and with the measurements of other family members. We developed a Bayesian model for the analysis of bivariate quantitative traits measured longitudinally in family genetic studies. For a given trait, family-specific and subject-specific random effects account for correlation among family members and among repeated measures, respectively. Correlation between traits is introduced by incorporating multivariate random effects and allowing time-specific trait residuals to correlate, as in seemingly unrelated regressions. The proposed model can examine multiple single-nucleotide variations simultaneously and can incorporate family-specific, subject-specific, or time-varying covariates. A Bayesian multiplicity technique is used to control false positives. Genetic Analysis Workshop 18 simulated data illustrate the proposed approach's applicability in modeling longitudinal multivariate outcomes in family genetic association studies.

    Using Mendelian inheritance errors as quality control criteria in whole genome sequencing data set

    Although the technical and analytic complexity of whole genome sequencing is generally appreciated, best practices for data cleaning and quality control have not been defined. Family-based data can be used to guide the standardization of specific quality control metrics in non-family-based data. Given the low mutation rate, Mendelian inheritance errors are likely the result of erroneous genotype calls. Thus, our goal was to identify the characteristics that determine Mendelian inheritance errors. To accomplish this, we used chromosome 3 whole genome sequencing family-based data from the Genetic Analysis Workshop 18 (GAW18). Mendelian inheritance errors were provided as part of the GAW18 data set. Additionally, for binary variants we calculated Mendelian inheritance errors using PLINK. Based on our analysis, nonbinary single-nucleotide variants have an inherently high number of Mendelian inheritance errors. Furthermore, in binary variants, Mendelian inheritance errors are not randomly distributed. Indeed, we identified three Mendelian inheritance error peaks that were enriched with repetitive elements. However, these peaks can be reduced by applying a single filter from the sequencing file. In summary, we demonstrated that erroneous sequencing calls are nonrandomly distributed across the genome and that quality control metrics can dramatically reduce the number of Mendelian inheritance errors. Appropriate quality control will allow optimal use of genetic data to realize the full potential of whole genome sequencing.
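
    For readers unfamiliar with the term, a Mendelian inheritance error at a binary (two-allele) variant is a child genotype that cannot be formed from one allele of each parent. The sketch below illustrates that check for a father-mother-child trio. It is not the PLINK implementation or the authors' pipeline; the function names, the 0/1/2 alternate-allele coding, and the restriction to autosomal sites with no missing genotypes are all assumptions made for illustration.

```python
from itertools import product


def gametes(genotype):
    """Alleles (0 = reference, 1 = alternate) that a parent can transmit.

    `genotype` is the parent's alternate-allele count: 0, 1, or 2.
    This coding is an assumption of the sketch, not taken from the abstract.
    """
    return {0: {0}, 1: {0, 1}, 2: {1}}[genotype]


def is_mendelian_error(father, mother, child):
    """Return True if the child's genotype cannot arise from these parents."""
    # Every child genotype reachable by taking one allele from each parent.
    possible = {a + b for a, b in product(gametes(father), gametes(mother))}
    return child not in possible


def count_errors(trios):
    """Count Mendelian errors over an iterable of (father, mother, child)."""
    return sum(is_mendelian_error(f, m, c) for f, m, c in trios)
```

    Counting such errors per variant and plotting the counts along a chromosome is one simple way to look for the kind of nonrandom error peaks the abstract describes.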